Personality Outcomes of Client-Centered Child Therapy*


ELAINE DORFMAN


WHILE the practice of child therapy
from the client-centered viewpoint 
has had a parallel development with that 
of adult counseling, research in child 
therapy lags far behind. Thus far, there 
has been no objective study of person- 
ality outcomes of individual client-cen- 
tered child therapy. Such a study would 
need to employ varied measures of out- 
come, since no single perfect criterion 
exists. It would also have to deal with 
the problem of adequate experimental 
controls. The project reported here had 
as its aim the assessment of therapy out- 
comes by personality tests, therapist judg- 
ments, and client follow-up statements. 


HYPOTHESIS
The central hypothesis is that per- 
sonality changes occur during a therapy 


* This is one of a series of studies conducted at 
the University of Chicago Counseling Center, 
with the financial support of the Medical Sciences 
Division of the Rockefeller Foundation. Grateful
acknowledgment of this support is made. 

This report is based upon a doctoral disserta- 
tion submitted to the Committee on Human De- 
velopment of the University of Chicago. The 
early stages of the research were guided by a 
thesis committee consisting of Carl R. Rogers 
(Chairman), Julius Seeman, and Morris I. Stein. 
The final committee consisted of Carl R. Rogers,
William F. Soskin, and John M. Shlien. 

Four persons generously gave time to judging 
parts of the data for reliability studies: Natalie 
R. Haimowitz, Wayne J. Oulton, Esselyn C. Rudi- 
koff, and Rolland Tougas. The writer is also 
grateful to Mary D. Mulroy, Principal in the 
Chicago public school system, for many kindnesses.

period which do not occur in the same
child during a no-therapy period, and 
which do not occur in control cases. In 
statistical terms, we test the null hypothesis
that there are no differences significant
at the 5% level between therapy
and nontherapy changes. While this is 
the only formally stated hypothesis, the 
study’s design implies two subsidiary 
ones: (a) that therapy can be conducted 
by an outsider in a school setting; (b)
that child therapy is possible without 
parent therapy. 


EXPERIMENTAL DESIGN 


The basic design is of the pretest and posttest 
variety. Two types of experimental control are
used. First, essential personality variables are 
controlled by studying the experimental group 
over a 13-week interval prior to therapy. This 
enables comparison of test score changes during
a no-therapy and a therapy period for each 
individual child, in an experiment in perfectly 
matched pairs. Because random assignment of 
cases to separate control and experimental groups 
is not feasible when the number of therapy 
candidates is small, and because the relevant 
variables for personality matching are unknown, 
the own-control method is employed. This has 
previously been done in studies by Bills (4) on 
the effects of play therapy on reading retardation,
and in research on counseling reported under
Rogers and Dymond (21).
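
The own-control comparison described above can be sketched as a paired t statistic on each child's two change scores. This is only an illustration under stated assumptions: the report names the t test but does not give its computing formula, the function name `paired_t` is hypothetical, and the five pairs of change scores are invented for the example, not taken from the study.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df) for the paired differences second - first.

    Each position holds one child's change score in the two periods,
    so every child serves as his own control.
    """
    if len(first) != len(second):
        raise ValueError("each child needs one value per period")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD, n - 1 in the denominator
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical change scores (negative = improved adjustment) for five children.
wait_change = [1.0, -2.0, 0.0, 3.0, -1.0]       # no-therapy period
therapy_change = [-4.0, -6.0, -1.0, -2.0, -5.0]  # therapy period

t, df = paired_t(wait_change, therapy_change)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

A negative t here would mean that scores fell more (improved more) during therapy than during the wait; the value must still be referred to a t table for the chosen significance level.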

A second type of experimental control is that 
for time, and for this a separate group of subjects
is required. Because the length of therapy
is not predictable, there exists no way, short of 
the clinically undesirable one of interruption 
of the therapy period, to equate the length of 
pretherapy and therapy intervals. Further, our 
pretherapy interval is a summer vacation period, 
whereas therapy was conducted during the school
year. Study of control subjects during an in- 
school period is thus required. 

Our attempt to provide a control group inter- 
val equal in length to the therapy period has 
been imperfectly realized. Because all test results 
were to be blindly evaluated at the study's close, 
prior matching of control and therapy cases was 
not possible. Therefore, the average length of 
therapy had to be estimated in planning the 
time of posttesting the control cases. The estimate
was 23 weeks, which proved to be 5.5
weeks too short. However, the time involved is 
closer to the therapy period's 28.5 weeks than is 
the 13-week pretherapy period. 

We tried to match the control and experimental
groups for initial test status (pretherapy
rather than pre-wait in the case of the experimental
group). Since more than one test is involved,
plus the factors of age and sex, it was
not possible to use precisely the same control 
group for each instrument. The control subjects 
thus vary somewhat with the test considered, 
although some overlapping exists. 

We might, of course, have used instead of 
test scores some criterion such as intelligence 
or socioeconomic status to match experimentals 
and controls. This would have permitted the 
use of a single control group throughout. But 
matching requires that the correlation between
matching and dependent variables be high 
enough to offset the loss in degrees of freedom 
for subsequent data analysis (10, pp. 288-295). 
We judged that test scores were more likely to 
provide such a correlated matching variable, thus 
increasing the efficiency of the experiment. 

The research design also includes a follow-up 
study of 15 experimental cases, re-evaluated a 
year after the therapist had left the school. The 
follow-up interval varies from one year to one 
and one-half years, depending upon when each 
case ended therapy.


SUBJECTS 
All therapy and control cases are from 
one public elementary school located in
a prosperous middle-class neighborhood
of Chicago.


Experimental Cases 


The therapy group consists of 17 chil- 
dren considered at least normally intelli- 
gent, 12 boys and 5 girls whose teachers 
believed them to be maladjusted. Eight- 
een children began therapy, but 1 was 
dropped from the study owing to strong 
presumption of mental defect, and because
an error of scheduling assigned him
to special tutoring and therapy at the 
same time, so that he missed many ses- 
sions. 

Four criteria determined inclusion in 
the therapy group: (a) the teacher's judg- 
ment that the child was among the five 
most maladjusted in his class; (b) the 
school principal's concurrence that he 
was emotionally disturbed; (c) the con- 
sent of at least one parent; (d) age limits 
of 9 to 12 years. The last ensured that
all tests would be appropriate for all 
cases. Test scores were not used in selec- 
tion. 


Control Cases 


The controls are not therapy candi- 
dates, but are matched for age, sex, and 
test scores. For the Rogers Test, matching 
of individuals was not possible, but the 
groups are equated for means and vari- 
ances on Total Scores. On the Sentence 
Completion Test, individuals are 
matched for their Mean Adjustment
Rating (a mean of 10 rated character- 
istics). 


INSTRUMENTS 


Three personality tests were given: one 
objective, one nonverbal projective, and 
one verbal projective. Fifteen therapy 
children also wrote follow-up letters to 
the therapist. 


Rogers Test of Personality Adjustment 


This is a standardized objective-type paper 
and pencil test (20) yielding five indices of maladjustment:
Total Score, Personal Inferiority,
Social Maladjustment, Family Maladjustment,
and Daydreaming.


Machover Human Figure Drawing Test


This is a nonverbal projective method (16), 
in which drawings of male and female figures 
are interpreted in psychoanalytic terms. While 
given to all subjects, this test remains unanalyzed,
owing to our failure to develop an objective
scoring system.


Sentence Completion Test 


A 2o-item test was assembled for this study, 
based in part upon versions published by Rohde 
(22), Rotter and Willerman (23), and Shor (26). 
Our detailed scoring manual (7, pp. 120-188) is 
not included in this report, but a summary of 
the scoring method and the results of reliability 
studies are given in the Appendix.

The Sentence Completion Test provides two 
kinds of data: (a) a theme and attitude analysis 
based upon the work of Pugh (18, pp. 203-231);
(b) an adjustment rating scale for 10 personality
characteristics, adapted from Goldberg (12, pp.
11-51) and Reader (19, pp. 94-132). The Mean
Adjustment Rating, an average of ratings on the
10 characteristics, is our index of general adjustment
status.
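
As a small worked illustration of the index just defined, the Mean Adjustment Rating is the arithmetic mean of one child's ratings on the 10 characteristics. The function name and the ten ratings below are hypothetical and serve only to show the computation; the actual rating scale and its anchors are described in the scoring manual, not here.

```python
import statistics


def mean_adjustment_rating(ratings):
    """Average one child's ratings on the 10 rated characteristics."""
    if len(ratings) != 10:
        raise ValueError("expected ratings on 10 characteristics")
    return statistics.mean(ratings)


# Hypothetical ratings for one record (higher = greater maladjustment).
example_ratings = [4, 5, 3, 4, 6, 5, 4, 3, 5, 4]
mar = mean_adjustment_rating(example_ratings)
print(mar)
```

Because the single index pools ten correlated ratings, a change in it should be read as a shift in general adjustment status rather than as ten separate findings.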


Follow-Up Letters


This fourth instrument was adapted from
Axline (2). Fifteen children of the 17 therapy
cases were asked to write to the therapist telling
what they remembered of the sessions and how
they were currently getting along. The manual
of directions for analyzing the letters (7, pp.
189-192) is not included in this monograph, but
a summary of the method and reports of reliability
studies are given in the Appendix.

The letters are rated along the two dimensions 
of memories of the therapy sessions and reports 
of life status. Refer to the Appendix for definitions
of these terms.


EXPERIMENTAL PROCEDURES 
Therapy 


All 17 cases were seen individually at school 
by the writer, in a large basement room formerly 
used for classes. Therapy was conducted within 
the framework of the client-centered approach, 
following principles outlined by Axline (1) and 
Dorfman (6). Sessions were held weekly, except 
for one child who came twice weekly at his own 
request. Three children who asked to bring
friends to one or more sessions were permitted to
do so. The children were told that they must
come 10 times, after which they could stop if
they wished. One school year was the maximum
allowed. As each case closed, the therapist rated
it as relatively successful or relatively unsuccessful,
without knowledge of test results.


Testing 


No tests were given by the therapist. For the
experimental group, pre-wait and pretherapy
projective tests were given in class by teachers.
The Rogers Test was given by a graduate psy- 
chology student to small groups partly composed 
of classmates considered well-adjusted by their 
teachers. In this way, we tried to avoid singling 
out the therapy children for special attention 
other than therapy.

Posttherapy tests were given individually by
the same psychology student who had administered
the pre-wait and pretherapy Rogers Tests.
The interval between termination of therapy and
posttesting was never more than a week. The
change from group to individual testing at this
point introduces an unavoidable nonexperimental
variable.


Follow-up tests were given in small groups
to therapy cases and others by a new psychometrist,
the previous one no longer being available.
Two of the 17 therapy children were no longer
enrolled in the school and were therefore not
followed up.

For the control cases, projective tests were 
given in class by teachers, and the Rogers Test 
was given in small groups by the same psychometrist
who tested the experimental group.


Follow-Up Letters


After the follow-up tests, each therapy child
was sent a letter at school, asking him to write
to the therapist, along the lines outlined in
the preceding section on instruments. The letters
were in individually addressed sealed envelopes.
All 15 children who were still enrolled in the
school answered.


Data Analysis 


Since the investigator-therapist is also principal
data analyst, care was required to rule out
bias in projective test analysis. The first problem
was formulation of a scoring method for
the Sentence Completion Test. To be most useful,
the method should be developed upon a
group similar in important respects to that to
which it would later be applied. Schoolmates
of the research group were thus chosen.

After the first testing of therapy candidates and
classmates, records of the former were set aside
until over two years later. From the remaining
166 records, a random sample of 50 was drawn,
and identifying data removed. The scoring manual
is based upon study of these 50 cases. Each
completion was typed on an index card, and
the 1,500 cards sorted in various ways as different
classificatory schemes were tried. Thus, the
range of responses to any item could be studied
in 50 cases, so that a fair notion of individual
differences guided the writing of the scoring
manual.


A year after the last therapy case ended, follow-up
tests were obtained, and at this point
analysis of experimental records began. All experimental
and control group protocols were
assigned code numbers from a table of random
numbers, and names and dates removed. This
allowed blind analysis, without knowledge of
whether a given record was from an experimental
or control case, or when the test had
been given.


TABLE 1
ROGERS TEST MEAN SCORES AT EACH TESTING*

                               Rogers Scales
Testing(b)       Total   Personal   Social    Family   Daydreaming

Experimental Group
  Pre-Wait       41.76    12.54     15.[?]     [?]        [?]
  Pretherapy     41.40    11.99     17.[?]     [?]        [?]
  Posttherapy    35.58     9.99     14.[?]     [?]        [?]
  Follow-Up      35.[?]    [?]      15.4[?]    [?]        [?]

Control Group
  Pretest        39.27    14.33     14.41      8.41       2.12
  End-Test       37.00    12.32     14.88      7.44       2.36

* Higher scores mean greater maladjustment.
(b) N=17 for all testings except follow-up, where N=15.


TABLE 2
ROGERS TEST MEAN SCORE CHANGES WITHIN EACH PERIOD*

                               Rogers Scales
Period(b)        Total   Personal   Social    Family   Daydreaming

Experimental Group
  Pretherapy      [?]      [?]       1.74     -1.02      -0.47
  Therapy        -5.88    -2.00      [?]4     -0.76      -0.65
  Follow-Up       [?]      [?]       0.75      0.50       0.45

Control Group
  Control        -2.27    -2.01      0.47     -0.97       0.24

* Negatively signed score changes indicate improved adjustment.
(b) N=17 for all periods except follow-up, where N=15.


RESULTS OTHER THAN TEST OUTCOMES


Despite the chance to stop after 10
compulsory sessions, 10 cases were still in
therapy when arbitrarily closed by the
end of the school year. Posttherapy tests
on these cases might also be viewed as
‘in-process’ measures, since therapy
would probably have continued had the
therapist remained available.

The average number of sessions is 19,
with a range of 11 to 33 contacts. These
occurred over a period averaging 28.5
weeks, with a range of 12 to 34 weeks.
The t test shows no sex difference in
therapy length.

Ten cases were rated by the therapist
as relatively successful and seven as relatively
unsuccessful. The judgment of
“relatively successful” means only that
some degree of movement was noted,
the nature of which varied from case to
case. Chi square shows no significant association
between sex and therapist
judgment, nor between method of termination
(by therapist or child) and therapist
judgment. Neither is there any association
between sex and method of termination.
Further, the t test shows no tendency
for therapist judgment to favor
either longer or shorter cases.


RESULTS OF THE ROGERS TEST OF
PERSONALITY ADJUSTMENT


Table 1 gives mean scores at each testing
for all Rogers scales. Higher scores
mean greater maladjustment. Table 2
shows the score changes within each
period of the study. In Tables 3 and 4,
the results are separately presented for
subgroups judged by the therapist to be
relatively successful and unsuccessful.


TABLE 3
ROGERS TEST MEAN SCORES* AT EACH TESTING OF SUCCESSFUL AND UNSUCCESSFUL SUBGROUPS

                               Rogers Scales
Testing          Total   Personal   Social    Family   Daydreaming

Successful Cases(b)
  Pre-Wait       42.19    12.72     14.75     11.92       2.80
  Pretherapy     43.29    11.89     18.60      9.50       3.30
  Posttherapy    34.34     7.89     15.30      9.15       2.00
  Follow-Up      39.11     8.57     18.46      8.80       3.28

Unsuccessful Cases(c)
  Pre-Wait       41.14    12.28     16.43      8.00       4.43
  Pretherapy     38.86    12.14     15.14      9.00       2.57
  Posttherapy    37.36    13.00     13.86      7.64       2.86
  Follow-Up      29.76    10.12     10.94      6.30       2.40

* Higher scores mean greater maladjustment.
(b) N=10 for all testings of successful cases.
(c) N=7 for all testings of unsuccessful cases, except for follow-up, where N=5.


TABLE 4
ROGERS TEST MEAN SCORE CHANGES* WITHIN EACH PERIOD FOR SUCCESSFUL
AND UNSUCCESSFUL SUBGROUPS

                               Rogers Scales
Period           Total   Personal   Social    Family   Daydreaming

Successful Cases(b)
  Pretherapy      1.10    -0.83      3.85     -2.42       0.50
  Therapy        -8.05    -4.00     -3.30     -0.35      -1.30
  Follow-Up       4.77     0.68      3.16     -0.35       1.28

Unsuccessful Cases(c)
  Pretherapy     -2.28    -0.14     -1.29      1.00      -1.86
  Therapy        -1.50     0.86     -1.28      [?]        0.29
  Follow-Up      -9.06     3.00     -4.0       [?]        1.20

* Negatively signed score changes mean improved adjustment.
(b) N=10 for all periods for successful cases.
(c) N=7 for all periods for unsuccessful cases, except for follow-up, where N=5.
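
Since each change score in Table 2 is the later mean minus the earlier mean in Table 1, and the total-group means are weighted averages of the subgroup means in Table 3, legible entries can be cross-checked arithmetically. The sketch below does this for the control-group rows and for two pre-wait means, using only values printed in the tables; the helper name `pooled_mean` is an illustrative assumption.

```python
SCALES = ["Total", "Personal", "Social", "Family", "Daydreaming"]

# Control-group means as printed in Table 1.
control_pretest = [39.27, 14.33, 14.41, 8.41, 2.12]
control_end_test = [37.00, 12.32, 14.88, 7.44, 2.36]

# Change = later mean minus earlier mean (negative = improved adjustment).
control_change = {
    scale: round(end - pre, 2)
    for scale, pre, end in zip(SCALES, control_pretest, control_end_test)
}


def pooled_mean(mean_a, n_a, mean_b, n_b):
    """Weighted mean of two subgroup means."""
    return (mean_a * n_a + mean_b * n_b) / (n_a + n_b)


# Pre-wait subgroup means from Table 3: 10 successful and 7 unsuccessful cases.
pre_wait_total = round(pooled_mean(42.19, 10, 41.14, 7), 2)
pre_wait_personal = round(pooled_mean(12.72, 10, 12.28, 7), 2)

print(control_change)
print(pre_wait_total, pre_wait_personal)
```

The computed control changes agree with the Control row of Table 2, and the pooled pre-wait means agree with the Pre-Wait entries for Total Score and Personal Inferiority in Table 1.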
Score changes are studied by the t test.
We begin by considering the therapy 
period. Then, we compare its results with 
those of the pretherapy period, the sepa- 
rate control group, and the follow-up 
period. 


Changes During Therapy Period 


For the total group of 17 cases, two of
five indices show significant improvement
during therapy. These are Total
Score (p < .01) and Social Maladjustment
(p < .05). There is a possible trend to
improvement on Personal Inferiority
(p < .10). See Table 2 for amounts of
change.

Subgroup analysis shows no significant
differences between changes of boys and
girls, nor between self-closed and therapist-closed
cases. But cases judged successful
improve significantly more than
unsuccessful ones on Total Score (p < .05)
and Personal Inferiority (p < .05), although
their pretherapy scores do not
differ. See Table 4 for amounts of
change. Taken by themselves, the unsuccessful
cases fail to change significantly
from their own pretherapy mean on any
index, but the successful cases improve
significantly on Total Score (p < .01),
Personal Inferiority (p < .01), and Social
Maladjustment (p < .02). This means
that the total group's therapy improvements
mainly reflect changes by the cases
judged successful.

Returning to the total group of cases,
we find no relationship between length
of therapy and amount of improvement.
Whether length is expressed in number
of sessions or in number of weeks, ρ
varies from .28 to .39, all insignificant.
Neither is there any reliable relation between
number of contacts and degree of
pretherapy maladjustment on the Rogers
Test. Rho varies from -.20 to .46, none
significant at the .05 level. But trends to
significance may exist for Total Score
(p < .10) and Daydreaming (p < .06).
All of these ρ's may be spuriously low,
since variability in number of sessions is
artificially limited by the arbitrary closing
of 10 incomplete cases.

Throughout this report, ρ and W are understood
to be derived from Kendall's formulas
(14), which correct for tied ranks.
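
The tie correction mentioned in the note can be illustrated by one standard route to a tie-corrected rank correlation: assign tied observations the average of the ranks they span, then correlate the two sets of ranks. This is a sketch under that assumption, not a reproduction of the computing formula actually used in the study; the function names and the five pairs of values are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of observations tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank correlation computed as the product-moment r of average ranks.

    Assumes each variable takes at least two distinct values.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical data: number of sessions and amount of improvement, with ties.
sessions = [11, 19, 19, 25, 33]
improvement = [2.0, 5.0, 3.0, 5.0, 9.0]
rho = spearman_rho(sessions, improvement)
print(round(rho, 3))
```

With no ties present, this reduces to the familiar rank-difference formula; with ties, the average-rank treatment keeps the coefficient from being distorted by arbitrary ordering of equal values.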


Changes During Pretherapy Period 


Therapy period changes cannot be
ascribed to the sessions unless it can be
shown that similar changes do not occur
in the group during the pretherapy or
control period. None of the pretherapy
changes shown in Table 2 are significant,
so that the hypothesis of spontaneous
improvement is untenable for this group
as a whole.

Subgroup analysis shows no sex dif- 
ferences in amount of change, nor are 
there differences in the relative changes 
of self-closed and therapist-closed cases.
The latter means that the subsequent 
tendency to discontinue therapy cannot 
be ascribed to spontaneous improvement 
in these cases prior to therapy. 

As was found true of the therapy
period, there is a difference in the relative
changes of cases later judged successful
or unsuccessful in therapy. Among
the changes shown in Table 4, only the
difference on Family Maladjustment is
significant at the .05 level, and favors the
successful cases. Reverse trends may exist
for Social Maladjustment and Daydreaming,
where differences significant at the
.10 level favor the unsuccessful cases.
However, the unsuccessful subgroup was
initially more maladjusted on Daydreaming
(p < .01), so that its somewhat
greater improvement during the pretherapy
interval may reflect regression
effects in the test.


Considered by themselves, the unsuccessful
cases fail to depart significantly
from their own pre-wait mean on any index,
although a trend to improvement
is found on Daydreaming (p < .10). In
contrast, the successful cases improve on
Family Maladjustment (p < .05), and
grow worse on Social Maladjustment
(p < .05). Taking only differences significant
at the .05 level, we note that the
successful cases change on two of five
Rogers Test indices during the pretherapy
period, showing gain on one and
decrement on one. The unsuccessful cases
never change significantly. This implies
that it is easier for therapy to channelize
existing change into the direction of better
adjustment, insofar as the Rogers
Test measures it, than to institute change
where it is not already occurring to a significant
degree.


Comparison of Changes During 
Pretherapy and Therapy Periods 


We saw above that for the total group,
certain reliable changes occur during
therapy, but not within the pretherapy
interval. The question now is whether
therapy score shifts are significantly
greater than pretherapy ones. Among
the changes shown in Table 2, that on
Total Score improves significantly more
(p < .05) during therapy than prior to it.
A trend to significant difference on Social
Maladjustment (p < .10) also favors the
therapy interval.

No subgroup comparisons are made
for sex or termination method, since no
within-period differences exist. But having
found such differences in the successful
and unsuccessful cases, it is appropriate
to consider them further.

The 10 cases judged successful show
therapy score improvements reliably
greater than pretherapy changes on
Total Score (p < .05) and Social Maladjustment
(p < .01). See Table 4 for the
amounts involved. The seven unsuccessful
cases show no difference between
their therapy and pretherapy changes.
Thus, the total group's greater improvements
during therapy as compared to
the pretherapy interval are largely due
to gains of cases which the therapist
viewed as relatively successful.


Changes in the Separate Control Group 


While the total experimental group 
shows no reliable pretherapy gain, it 
might be argued that the interval is too 
short, and that in a vacation period chil- 
dren are deprived of possible gains from 
mere school attendance. The separate 
control group, studied over a 23-week 
school period, shows no reliable gain on 
any Rogers index. A trend to improve- 
ment may exist on Personal Inferiority 
(p < .10). On the whole, though, it seems 
fair to conclude that time and school
attendance, within our given limits, do 
not improve test performance. 


Comparison of Control Group and 
Therapy Changes 


The experimental group during ther- 
apy improves significantly more than the 
separate controls on Social Maladjustment
(p < .05). A trend to significance
on Total Score (p < .10) also favors the
experimental group. Compared with the
results of the therapy versus own-control
period tests, we find that it is the same
two indices which favor the therapy interval,
although their significance levels
are reversed here. Amounts of change
within each period are found in Table 2.
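
A comparison of change scores between two separate groups, such as therapy cases and controls, can be sketched as a two-sample t statistic. The report does not state which form of the t test was applied to the matched groups, so the unequal-variance (Welch) form below is an assumption chosen for illustration; the function name and the change scores are likewise hypothetical.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Return (t, df) for the difference in means of two independent groups."""
    var_a = statistics.variance(group_a)  # sample variances
    var_b = statistics.variance(group_b)
    n_a, n_b = len(group_a), len(group_b)
    se_sq = var_a / n_a + var_b / n_b
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical change scores (negative = improved adjustment).
therapy_changes = [-4.0, -6.0, -1.0, -2.0, -5.0]
control_changes = [0.0, 1.0, -1.0, 2.0, -2.0]

t_between, df_between = welch_t(therapy_changes, control_changes)
print(f"t = {t_between:.3f}, df = {df_between:.2f}")
```

A negative t in this arrangement means the therapy group's scores dropped more than the controls' did; where controls are individually matched to therapy cases, a paired analysis of the matched differences would be the alternative.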


Comparison of Control Group and 
Pretherapy Changes 


Here we compare test changes during
two no-therapy intervals. There are no
reliable differences between them. Thus,
a longer interval between testings does
not result in greater change, nor does 
the in-school out-of-school variable pro- 
duce a difference. Likewise, the data indi- 
cate that children recommended for 
therapy are no more apt to show test 
changes prior to therapy than a group 
of noncandidates. 


Changes During Follow-Up Period 


Are the therapy gains lasting, or are
they transitory and dependent upon the
continuance of the therapeutic relationship?
We answer this by comparing the
posttherapy and follow-up scores of the
15 cases still in the school at follow-up
time. Posttherapy and follow-up scores
of these 15 children do not differ significantly.
Thus, therapy gains are maintained.
This is true even though 9 of the
15 cases were arbitrarily closed. Further,
compared to pretherapy scores, follow-up
tests are reliably better on Total Score
(p < .02) and Personal Inferiority
(p < .02). A trend to significance is found
on Family Maladjustment, which also
favors the follow-up test. This
indicates absence of regression to pretherapy
status upon discontinuance of
the therapy sessions.

There are no sex differences in follow-up
changes, which agrees with results of
the pretherapy and therapy periods. But
comparison of follow-up changes among
self-closed and therapist-closed cases
shows a difference on Social Maladjustment
(p < .05) favoring the self-closed
cases. The subgroups do not differ on
posttherapy Social Maladjustment. Thus
it is not until the follow-up period that
a difference in changes made by self-closed
and therapist-closed cases occurs.
But the self-closed subgroup's apparent
improvement over its own posttherapy
mean is not significant for any index.
The apparent regression of the therapist-closed
group also lacks reliability. There
may be a trend to significant regression
on Social Maladjustment (p < .10) in
the therapist-closed cases, but whether
a longer follow-up interval would
strengthen or reverse this trend is problematical.


Changes of successful and unsuccessful
cases differ on Total Score (p < .02), Personal
Inferiority (p < .05), and Social
Maladjustment (p < .02), and all these
differences favor the unsuccessful subgroup.
A trend to significance on Daydreaming
(p < .10) also favors the unsuccessful
cases. The Personal Inferiority
difference may reflect regression effects
in the test, since there the subgroups differ
on posttherapy status. But taken by
themselves, neither subgroup changes
reliably from its own posttherapy mean
on any index. However, while the regression
of successful cases is never closer to
significance than the .20 level, improvement
of unsuccessful cases reaches the .10
level on Total Score, Personal Inferiority,
and Social Maladjustment. Here
again, the question of the possible effect
of a longer follow-up period can be
raised but not answered by this study.


Comparison of Follow-Up and 
Therapy Period Changes 


It has been seen thus far that the
group as a whole undergoes significant
changes during the therapy interval, but
not during the follow-up period. But are
the between-period differences themselves
reliable? The data show no significant
difference between therapy and follow-up
period changes for the group of 15 cases.
Thus, a longer interval does not necessarily
result in greater score change.

The 10 successful cases show reliable
differences between their follow-up and
therapy period changes on Total Score
(p < .02), Personal Inferiority (p < .05),
and Social Maladjustment (p < .05), and
these favor the therapy period. The
movement is toward improvement during
therapy and decrement across the follow-up.
But we noted earlier that their
therapy improvements are reliable and
that their follow-up decrements are not.
This means that for this particular subgroup,
the essential changes occur during
therapy and not after it.

The five unsuccessful cases followed 
up are quite different. Their follow-up 
and therapy period changes show no 
significant difference. Further, we noted 
above that their therapy changes never 
approach significance, while their follow- 
up period improvements attain the .10 
level on Total Score, Personal Inferiority, 
and Social Maladjustment. It looks as 
though we may possibly have beginnings 
of change during the follow-up period, 
but whether a longer interval would 
confirm this is uncertain. 


Comparison of Follow-Up and 
Pretherapy Period Changes 

This comparison involves two no-therapy
intervals of the experimental
group. There are no significant differences
between them. The longer follow-up
period does not result in greater
change. This answers the question posed
by the relative brevity of the pretherapy
period in comparison to the therapy
interval. Our findings render unlikely
the hypothesis that the superiority of
the changes during therapy is attributable
to the time disparity.

Although there are within-period differences
in changes of successful and unsuccessful
cases, neither subgroup shows
any reliable difference between its own 
follow-up and pretherapy changes. 
Comparison of Follow-Up and Control
Group Changes 


Here we compare changes during two 
no-therapy intervals for two different 
groups of subjects. There are no signifi- 
cant differences between changes during 
the follow-up period and those of the
control group. Thus, as in the case of the 
pretherapy versus separate control group 
comparison, a time period about twice as 
long does not result in greater change 
on the Rogers Test. 


RESULTS OF THE SENTENCE 
COMPLETION TEST

Themes and Attitudes (see Appendix
for descriptions of these) show only ran- 
dom fluctuations throughout the study. 
They appear to be highly stable meas- 
ures, affected neither by time nor by 
therapy. This may reflect either the use 
of insensitive categories or true stability 
of underlying function. 

Since there are no changes in Themes 
and Attitudes, we shall confine ourselves 
to study of the Adjustment Ratings. As 
Table 9 shows these to be correlated
measures, we shall describe our results
in terms of the Mean Rating, a general
adjustment index. Presentation of
changes by individual scales would falsely
imply a broader area of personality
change than is justified by our overlapping
measures.

Table 5 gives group averages on Mean 
Rating at each testing and also changes 
within each period of the study, for the 
total experimental and control groups. 
In Table 6, the results are separately 
presented for subgroups judged by the 
therapist to be relatively successful and 
relatively unsuccessful. All differences 
are evaluated by the t test. As with the
Rogers Test, we begin with results of the 
therapy period. These are then compared 
with those for the pretherapy period, the
separate control group, and the follow-up
period.


TABLE 5
SENTENCES TEST MEAN ADJUSTMENT RATING AT EACH TESTING AND
CHANGES WITHIN EACH PERIOD*

                  Mean Rating(b) at Each Testing

              Experimental Group                       Control Group
Pre-Wait   Pretherapy   Posttherapy   Follow-Up     Pretest   End-Test
  4.05        4.54         3.24         3.23          4.53      4.45

                  Mean Change(c) Within Each Period

Pretherapy   Therapy   Follow-Up   Control
   0.40       -1.30      [?]2       -0.08

* N=17 for all cases except follow-up, where N=15.
(b) Higher ratings mean greater maladjustment.
(c) Negatively signed rating changes mean improved adjustment.


Changes During Therapy Period 


During therapy, there is a significant
improvement in general adjustment within
the total group of cases (p < .001).
Subgroup analysis shows no differences
in relative changes of boys and girls, self-closed
and therapist-closed cases, or cases
judged successful and unsuccessful. Unlike
the Rogers Test, Sentences Mean
Adjustment Ratings bear no relationship
to therapist judgment. The absence of
subgroup differences may be a function
of the relatively consistent trend toward
better adjustment. Where nearly everyone
improves, there is less chance for
subgroup differences to emerge.


TABLE 6
SENTENCES TEST MEAN ADJUSTMENT RATING
AT EACH TESTING AND CHANGES WITHIN
EACH PERIOD FOR SUCCESSFUL AND
UNSUCCESSFUL SUBGROUPS*

                Mean Rating(b) at Each Testing

               Pre-Wait   Pretherapy   Posttherapy   Follow-Up
Successful       4.11        4.76         2.26         3.20
Unsuccessful     4.04        4.21         3.93         3.30

                Mean Change(c) Within Each Period

               Pretherapy   Therapy   Follow-Up
Successful        0.65       -1.51       0.05
Unsuccessful      0.17       -1.00       0.4[?]

* N=10 for successful cases. N=7 for unsuccessful
cases, except for follow-up, where N=5.
(b) Higher ratings mean greater maladjustment.
(c) Negatively signed rating changes mean improved
adjustment.

In agreement with the Rogers Test, we
again find no relationship between
length of therapy and amount of improvement.
Whether therapy length is
expressed in number of sessions or in
number of weeks makes no difference, the
respective ρ's being .08 and .08. Neither
is there any relationship between number
of therapy sessions and degree of
pretherapy maladjustment on Mean
Adjustment Rating, as seen in a ρ of .10.
As noted for the Rogers Test, these
correlations may be spuriously low on
account of artificially limited variability in
number of therapy sessions.


Changes During Pretherapy Period 

The total group shows significant
adjustment decrement in the pretherapy
interval (p < .02), as seen in a higher
Mean Rating at its close (Table 5).

Subgroup analysis indicates no reliable
difference between changes of boys versus
girls, self-terminated versus
therapist-terminated cases, and cases judged
successful or unsuccessful.


Comparison of Changes During 
Pretherapy and Therapy Periods 

While the therapy interval is characterized
by significant improvement on
Mean Adjustment Rating and the pretherapy
period by significant decrement,
we must ask whether a reliable
between-period difference exists. Mean Rating
improvement during therapy is reliably
greater than pretherapy change
(p < .001).

Subgroup analyses are not made, since
there are no within-period differences.


Changes in the Separate Control Group 


Although therapy period improvement
clearly outweighs pretherapy change, it
must be kept in mind that the pretherapy
interval is shorter and is an out-of-school
period. The separate control group,
studied over a longer in-school
period, shows no reliable change in Mean
Rating. Once again, a longer in-school
period does not in itself result in
improved adjustment.


Comparison of Control Group and
Pretherapy Changes

This comparison involves two no-therapy
intervals which differ in time
span and in the in-school out-of-school
variable. The separate control cases
improve more on Mean Rating than do the
experimental cases prior to therapy
(p < .01). But we saw above (Table 5)
that the controls show an insignificant
improvement, while the experimentals
show a significant decrement. Thus, the
reliable difference between their changes
arises from their movement in opposite
directions. It is theoretically possible
that, given a longer in-school pretherapy
period, the experimental group might
show improvement rather than decrement
in adjustment on Mean Rating.
Our data cannot answer this question,
but it is not crucial, because we saw in
the immediately preceding section that
the therapy improvement is reliably better
than that of the control group. From
this we conclude that even if a hypothetical
longer pretherapy period resulted
in equivalent pretherapy and control
group changes, these would still be
less than those of the therapy period.


Comparison of Control Group and
Therapy Changes


While the therapy group improves
during therapy and the control group
does not, we must still ask whether
the therapy changes are reliably greater.
Therapy improvement on Mean Rating
is significantly greater than that of the
group of separate control cases (p < .001).
This agrees with the results of the therapy
versus own-control comparison. It
indicates that the greater improvement
during therapy as compared to the
pretherapy interval cannot be due to duration
or to school-period.


Changes During Follow-Up Period 


While therapy Mean Rating improvement
is clearly greater than pretherapy
and control group gains, its durability
has to be established. No reliable difference
exists between posttherapy and
follow-up Mean Rating of 15 cases followed
up. Thus, the therapy gain is maintained
after the interviews cease. Moreover,
follow-up Mean Rating is significantly better
than at pretherapy (p < .001). There
is thus no regression to pretherapy status
upon discontinuance of therapy.

Subgroup analysis shows no reliable
contrasts between follow-up period
changes of boys versus girls, self-closed
versus therapist-closed cases, nor in those
of cases judged to be successful or
unsuccessful.


Comparison of Follow-Up and Therapy 
Period Changes 


Thus far, we have seen that Mean Rating
improves significantly during therapy,
but not across the follow-up interval.
We now find a reliable between-period
difference which favors the
therapy period (p < .01) for the 15 cases
followed up. This means that the basic
change in these cases occurs during
therapy and not after it, even though the
follow-up period is longer.

Subgroup analyses are not made, since 
there are no within-period differences 
for any subgroup. 


Comparison of Follow-Up and 
Pretherapy Period Changes 

The comparison here is between
changes during a short no-therapy interval
just prior to therapy and a longer
no-therapy interval just after therapy. For
the 15 cases followed up, there is no
reliable difference in Mean Rating changes
between pretherapy and follow-up periods.
Once again, a longer in-school
period does not result in greater improvement.
This is further evidence that the
superiority of the therapy period gains,
relative to those of the brief pretherapy
period, cannot be explained by the time
difference.

Subgroup analyses are not offered be- 
cause there are no within-period sub- 
group contrasts on Mean Rating changes. 


Comparison of Follow-Up and Control
Group Changes


Here we compare changes during two
no-therapy intervals, for the experimental
and control groups. There is no
reliable difference in Mean Rating
changes between follow-up and control
group, and this despite the fact that the
former period covers a time interval at
least twice as long.


RESULTS OF THE FOLLOW-UP LETTERS


Ten letters recall the therapy sessions
with pleasure and are thus classified
Positive for therapy memories; two are
hostile or Negative; three are noncommittal
or Neutral. A chi-square test of
the hypothesis of zero difference among
categories results in a value of χ² = 7.60
(p < .05). Thus, the obtained distribution,
in which positive memories predominate,
differs significantly from
chance expectancy.

Nine letters exhibit a positive sense of
well-being and are therefore classified
Positive for reported life status; two express
unhappiness or failure and are
classified Negative; four are noncommittal
or Neutral. A chi-square test of the
hypothesis of zero difference among categories
gives χ² = 5.20 (p < .10). Thus,
the obtained distribution does not differ
significantly from chance expectancy,
although we may speak of a trend to significance.
We may note that two of the
four cases classified Neutral for reported
life status fail to supply any information
on this question. Their letters are confined
to therapy memories, and our procedure
in cases of failure to respond has
been to classify the letter automatically
as Neutral.
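
The two chi-square values above can be checked by hand, since the null hypothesis assigns the same expected count (15 / 3 = 5) to each category. The following is a minimal sketch, not the author's procedure: the function name is hypothetical, and the critical values for 2 degrees of freedom are taken from standard chi-square tables rather than from the paper.

```python
def chi_square_equal_expected(observed):
    """Chi-square statistic when every category has the same expected count."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)


# Upper-tail critical values for 2 degrees of freedom, from standard
# chi-square tables (an assumption of this sketch, not quoted in the paper).
CRITICAL_DF2 = {0.10: 4.605, 0.05: 5.991}

# Letter counts as reported in the text: Positive, Negative, Neutral.
therapy_memories = [10, 2, 3]
life_status = [9, 2, 4]

for label, counts in (("therapy memories", therapy_memories),
                      ("life status", life_status)):
    stat = chi_square_equal_expected(counts)
    beyond_05 = stat > CRITICAL_DF2[0.05]
    beyond_10 = stat > CRITICAL_DF2[0.10]
    print(f"{label}: chi-square = {stat:.2f}, "
          f"exceeds .05 value: {beyond_05}, exceeds .10 value: {beyond_10}")
```

Under these assumptions the statistic for therapy memories exceeds the .05 critical value, while the statistic for life status exceeds only the .10 value, in line with the probabilities reported in the text.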

The agreement between therapy memories
and reported life status classifications
is examined by chi square. There
are six agreements and nine disagreements;
with three judgment categories,
chance agreement is one-third. Testing
the deviation from expected agreements,
χ² = 0.30 (p < .70). The retrospective
view of therapy is thus independent of
reported current life status.

By inspection, the letters of the following
subgroups show no differences: (a)
cases having above and below the average
number of therapy sessions; (b) self-closed
and therapist-closed cases; (c)
cases judged successful and unsuccessful;
(d) boys and girls. No statistical tests are
made on these enumeration data because
frequencies are too small for chi
square; Yates' correction is inapplicable,
as more than one degree of freedom is
involved.


INTERRELATIONSHIPS AMONG MEASURES


The three preceding sections present test results
taken one at a time. We now examine the
extent of agreement or disagreement among
them. Comparisons are limited to data of the
experimental group. In the section on experimental
design, we indicated that it was not
possible to use precisely the same control cases
for the Rogers and Sentence Completion tests, so
that no between-test comparisons can be made
for them. In addition, only the therapy children
wrote follow-up letters.


TABLE 7
RHO CORRELATIONS BETWEEN SENTENCES MEAN ADJUSTMENT RATING AND
ROGERS SCALES AT EACH TESTING OF EXPERIMENTAL GROUP*

  Testing      Total  Personal     Social       Family       Daydreaming
  Pre-Wait     -.14   .23          [illegible]  [illegible]  [illegible]
  Pretherapy    .18   [illegible]  [illegible]  [illegible]  [illegible]
  Posttherapy  -.26   .31          [illegible]  [illegible]  [illegible]
  Follow-Up    -.06   [illegible]  [illegible]  [illegible]  [illegible]

* N=17 for all testings except follow-up, where N=15.


Relationship Between Scores on Rogers
and Sentence Completion Tests


Rho correlations between Rogers Test scales
and Sentences Mean Adjustment Rating at each
of four testings are given in Table 7, which
shows no significant coefficients. Contrary to
expectation, 16 of the 20 ρ's are negative. By the
one-tailed Sign Test, this is significant at the .01
level. These results, of course, are no estimate of
test agreement in an unselected sample. By studying
children whom teachers considered maladjusted,
we have, by definition, tapped one extreme
of the curve. This may well have distorted
any relationships between the Rogers and
Sentences tests.
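
The Sign Test figure can be reproduced from the binomial distribution: if positive and negative coefficients were equally likely, the one-tailed probability is the chance of 16 or more negatives among 20. This is a minimal sketch under that assumption; the function name is hypothetical, and the calculation treats the 20 coefficients as independent, which the paper does not claim.

```python
from math import comb


def sign_test_one_tailed(successes: int, n: int) -> float:
    """P(X >= successes) for X ~ Binomial(n, 0.5)."""
    favorable = sum(comb(n, k) for k in range(successes, n + 1))
    return favorable / 2 ** n


# 16 of the 20 rho coefficients are negative, as reported in the text.
p_value = sign_test_one_tailed(16, 20)
print(f"one-tailed p = {p_value:.4f}")
print("significant at .01:", p_value < 0.01)
```

The exact tail probability is 6196 / 1048576, which falls below .01 and so agrees with the level stated in the text.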


Comparison of Changes on Rogers and 
Sentence Completion Tests 


Changes on each Rogers Test scale are compared
with those on Sentences Mean Adjustment
Rating for each period of the study. Rho coefficients
are given in Table 8, which indicates lack
of relationship between measures of change.
Since 15 comparisons are made, the one significant
negative coefficient may be a chance effect.

TABLE 8
CORRELATIONS BETWEEN CHANGES ON SENTENCES MEAN ADJUSTMENT RATING
AND ROGERS SCALES WITHIN EACH PERIOD OF EXPERIMENTAL GROUP*

Correlations with Each Rogers Scale

  Period      Total  Personal  Social       Family       Daydreaming
  Pretherapy  .26    .48       .23          .50          .39
  Therapy     .17    .28       [illegible]  [illegible]  [illegible]
  Follow-Up   .25    -.02      [illegible]  [illegible]  [illegible]

* N=17 for all periods except follow-up, where N=15.
p < .05.


Comparison of Follow-Up Letters
and Tests


In order to compare the follow-up letters and
tests, a numerical score for the letters is required.
It is based upon the judgment method outlined
in the section on instruments, since the reliability
of this is known. For both therapy memories
and reported life status, a value of one is
assigned to a letter judged Positive, a value of
two to a Neutral letter, and a value of three
to a Negative one. Since each letter is evaluated
for two factors, the scores can vary from two
to six. The 15 letters are then ranked for degree
of positive affect. Obviously, there are many
ties, but allowance can be made for this (14,
pp. 80-89).

Since the letters constitute a very crude index
of adjustment status, we compare them only
with our other indices of over-all status, Rogers
Total Score and Sentences Mean Adjustment
Rating.

The 15 cases are ranked at follow-up for
Letters, Sentences, and Rogers Test status, and
the rankings simultaneously compared by the W
statistic. W = .47 (p < .20), indicating that if
each measure taps some aspect of personal
adjustment, it samples a different facet of the
criterion universe. This is possible when so
complex a criterion as adjustment is involved.

DISCUSSION 


The absence of significant relationship 
between the therapist's judgment of outcome
and the duration of therapy is somewhat
surprising, since one might expect
that a client’s willingness to continue 
therapy would influence the therapist's 
view of his progress. That this is not 
the case here may reflect idiosyncrasy of 
judgment, inasmuch as the work of a 
single therapist is involved. This might 
explain the discrepancy from Cartwright's
study of 78 adult client-centered
cases (5), where a curvilinear relationship,
positive at both extremes of length,
existed between length and judged success.
His results are supported by those
of Taylor (28), in a study of gog analyti- 
cally oriented cases. While it may be that 
length of therapy is a less significant dimension
among children than among
adults, a child therapy study involving
several therapists would be required to 
test this. Moreover, our arbitrary closing 
of 10 cases, dictated solely by conveni- 
ence, automatically restricted variability 
in therapy length, thus obscuring whatever
correlation might actually exist between
length and other factors.

Absence of sex difference in judged
outcome agrees with Cartwright's findings
on adult cases (5). But, in contrast
with our results, Dymond (8, 9) found
test changes which favor adult females.
Whether this reflects a difference between
children and adults or a difference
between measuring instruments remains
unknown. However, our lack of sex difference
in test outcomes also disagrees
with a study of nondirective group play
therapy by Fleming and Snyder (11).
They found greater Rogers Test improvements
among girls. In their study,
both sexes were seen by one female therapist.
The difference from our study,
also employing a single female therapist,
may mean that the therapist's sex plays a
different role in group and individual
therapy. The personality of the particular
therapist involved may also be a factor.
These are moot points at present,
but it is possible to design experiments
capable of answering them. It should
also be kept in mind that in our study
boys outnumbered girls 12 to 5, so that
sex comparisons are based upon a very
small group of girls. No attempt was
made to include an equal number of
each sex. Boys are more frequently referred,
and it was desired that the experiment
reflect the conditions of therapy as
ordinarily practiced.

Our Rogers Test results for the ther- 
apy period contradict those of Seeman 
and Edwards (25) on group child therapy 
with maladjusted retarded readers. Their
therapy cases grew worse on Rogers
Total Score, the decrement very nearly 
being significantly greater than that of 
the control group. Their study involved 
daily sessions with a teacher-therapist for 
four months, in contrast to our own con- 
ditions of weekly sessions with a non- 
teacher therapist over an average of 28.5 
weeks. It is unlikely that the time or 
group therapy differential is responsible 
for the discrepancy between their results 
and ours, since Fleming and Snyder's 
(11) group therapy, covering an even 
briefer period of six weeks, did not 
result in Rogers Test decrement. But if 
the time and the group-individual factors 
are not per se explanatory, this does not 
rule out their possible interaction effects 
with other conditions such as reading re- 
tardation. 

What sort of children do teachers refer 
for psychotherapy? Referral comments as 
well as observations of the therapist in- 
dicate that shy and “good” children as 
well as aggressive and troublesome ones 
were included. Of the 17 cases, 16 were
maladjusted on some Rogers Test index 
at the time of referral, and 10 were on 
the maladjusted side of the scale on Sen- 
tences Mean Adjustment Rating. While 
there was considerable variation in the 
group's initial test performance, the 
teachers’ opinions generally appear reas- 
onable. 

We have noted statistically significant 
therapy changes on both Rogers and Sentences
tests, but this does not guarantee
practical significance, which depends 
upon the size of the change in relation to 
the amount of effort expended in bring- 
ing it about. If some other more economi- 
cal procedure than individual client-cen- 
tered therapy were to result in similar 
changes, our data would have little practical
importance. We do not have comparative
information on other procedures,
so that an attempt to assess the
amount of our changes must take
another form.

On the Rogers Test, the average Total
Score improvement during therapy is
5.88 points, which is 14% of the mean
pretherapy score of 41.46. But the SD
of pretherapy scores is 9.72 points, so
that the mean therapy change represents
less than 1 SD. Thus, while mean therapy
change is reliable, it is small in relation
to variability in pretherapy scores. On
the Sentence Completion Test, the average
Mean Adjustment Rating improvement
during therapy is 1.30 scale intervals,
which is 29% of the mean pretherapy
rating of 4.54. Since the SD of pretherapy
ratings is 0.52, the therapy
change represents 2.50 SD. Thus, therapy
change is large in relation to variability
in pretherapy ratings. These comparisons
imply greater change on the projective
than on the objective test, which
runs counter to the idea that projective
tests measure the "deeper" personality.
We should expect more change on an
objective test, which presumably taps
more superficial aspects. Perhaps the distinction
is less applicable to children;
but, even so, this would not explain the
direct reversal of the prediction.
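
The two ways of sizing the therapy change used above (as a percentage of the pretherapy mean, and in pretherapy SD units) can be checked with a short calculation. This is a minimal sketch using only the figures quoted in the paragraph; the function and key names are hypothetical.

```python
def change_summary(mean_change: float, pre_mean: float, pre_sd: float) -> dict:
    """Express a mean change relative to the pretherapy mean and SD."""
    return {
        "percent_of_pre_mean": 100 * mean_change / pre_mean,
        "sd_units": mean_change / pre_sd,
    }


# Figures quoted in the text.
rogers = change_summary(mean_change=5.88, pre_mean=41.46, pre_sd=9.72)
sentences = change_summary(mean_change=1.30, pre_mean=4.54, pre_sd=0.52)

for name, summary in (("Rogers Total Score", rogers),
                      ("Sentences Mean Adjustment Rating", sentences)):
    print(f"{name}: {summary['percent_of_pre_mean']:.0f}% of pretherapy mean, "
          f"{summary['sd_units']:.2f} SD")
```

The Rogers change comes to roughly 0.6 SD and the Sentences change to 2.50 SD, which is the contrast the paragraph draws between the objective and the projective test.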
Let us assume for the moment that
the Sentence Completion Test is a valid
instrument, although this is entirely
unknown. At the pre-wait examination, our
average therapy subject looks more mal- 
adjusted on it than on the Rogers Test. 
If the projective test is indeed more sen- 
sitive to the presence of psychological 
disturbance, then it would not be sur- 
prising if it were also more sensitive to 
slight changes in such disturbance. Some 
evidence may be found in the pretherapy 


period, where the group grew reliably
worse on Sentences Mean Adjustment
Rating, but not on any Rogers scale. 
Therefore, the projective test’s showing 
of greater change is not restricted to the 
period of therapy. During the follow-up 
period, neither test changed reliably. 
We therefore infer that when a change 
is great enough to show at all on the 
Rogers Test, it is apt to be more marked 
on the Sentence Completion Test. This 
inference would have to be tested on ad- 
ditional samples before real confidence 
could be placed in it, but it is interesting 
enough to warrant further study. 

Lack of reliable test improvements 
prior to therapy agrees with findings by 
Grummon (13) in a group of 2g adult 
client-centered cases studied over a 60-day
no-therapy period. But our own results
are more extreme: the Sentences
Mean Adjustment Rating shows actual 
adjustment decrement. Since we found 
no such decrement on the Rogers Test, 
this suggests that some peculiarity of the 
Sentence Completion Test, rather than 
of our client group, is involved. Whether 
the Sentence Completion Test is more 
sensitive to minor changes, or whether 
it measures some different aspect of per- 
sonality adjustment from that sampled 
by the Rogers Test, is uncertain. 

Recent work by Barron and Leary (3) 
contradicts both our own and Grum- 
mon’s (13) results. They found reliable 
MMPI improvements in a group of neu- 
rotic adults waiting for therapy, and no 
difference between their mean changes 
and those in a group undergoing an- 
alytically oriented treatment. However, 
ethical scruples prevented random as- 
signment to control and experimental 
groups, so that those expected to regress 
without treatment were more likely to be 
placed in the treatment group. Moreover, 
an important difference between their 
study and ours is that, during the pretherapy
period, our children had no idea
that they would later receive therapy.
Also, they were selected by teachers, not
by self-referral, so that the variable of 
motivation for change is not a factor in 
the sense in which it might operate to 
produce changes in a self-selected group. 

The unknown validity of the Sentence 
Completion Test is a serious shortcom- 
ing of this study. We had originally 
hoped to attack this problem by having 
teachers select the five best-adjusted as 
well as the five worst-adjusted children in 
their classes. A very crude validity check 
was to have been made by comparing 
protocols of these two groups. But it 
turned out that the children considered 
best-adjusted were nearly all girls, while 
those judged worst-adjusted were mainly 
boys, making it fruitless to compare their 
records for validity purposes. It is un- 
likely that these judgments simply reflect 
female teachers’ prejudices, as it is com- 
mon experience that boys tend to show 
more frequent emotional problems than 
girls (16). 

One of our findings is particularly at
variance with expectations from research 
in adult counseling. We found no ther- 
apy change in relative number of posi- 
tive, negative, and neutral attitudes on 
the Sentence Completion Test. Seeman's
protocol analysis of 10 adult client-centered
cases showed a clear shift from
negative to positive attitudes as therapy
progressed (24). Perhaps there is a real
difference in this regard between the
therapy of children and adults. As a matter
of fact, increased expression of negative
feelings characterized the four play
therapy cases studied by Landisberg and
Snyder (15). Our own results do not
agree with theirs, but there is an age
difference of about four years involved.
Moreover, attitudes expressed in therapy
sessions may not have the same significance
as those shown on a psychological
test. But since the present study shows 
reliable shifts in the adjustment ratings 
despite absence of change in distribution 
of positive, negative, and neutral atti- 
tudes, it is possible that the latter cate- 
gories are too crude or general to reflect 
test changes. 

Perhaps the greatest shortcoming of
the present study is the lack of behavioral
criteria of changes attributable to
therapy. Improvements noted on psychological
tests do not necessarily mean
better interpersonal relations, freer use
of intellectual capacities, more mature
behavior, or the achievement of other
goals generally implicit in the process of
therapy. To measure these requires observation
of the child outside the therapy
and testing situations, which was
unfortunately beyond the scope of this
investigation. But now that we have
evidence of some test improvements, the
next step might well be the conduct of
studies directed toward the problem of
behavior in other situations. This is particularly
important in view of the report
by Teuber and Powers (29). In a large-scale
study of therapy outcomes among
predelinquents in a social work setting,
they found no difference in later delinquency
of therapy and control cases,
despite the optimistic views of clients
and therapists.

In a child therapy study, behavioral 
criteria might come from teacher and 
parent interviews. This was impossible 
in our case because the single therapist
was fully occupied with the therapy sessions.
Another disadvantage of the use of
a single therapist is that it prevents generalization
from the results, which can
be a function of either the particular
therapeutic system or the therapist's personality.
Future research in this area
might better be planned as a group project,
to overcome some of these limitations.

In general, we find that psychotherapy, 
within the limits of our criteria, can be 
effective in a school setting, and without
parent counseling. This is not to deny 
possible benefits from such counseling. 
But many parents cannot accept the idea 
of help for themselves, even while al- 
lowing their children to receive it. If 
the clinician’s choice lies between refus- 
ing such cases altogether and trying to 
work with the child alone, our results 
suggest that the latter alternative has
beneficial possibilities.


SUMMARY AND CONCLUSIONS 


This project had as its aim the assess- 
ment of personality outcomes of indi- 
vidual client-centered child therapy by 
means of psychological tests, therapist 
judgments, and follow-up letters. The 
hypothesis was that personality changes 
occur during a therapy period which do 
not occur in the same child during a 
no-therapy period, and which do not 
occur in a control group. 

The basic experimental design in- 
volved observation during three time 
periods for the therapy group of 17 
cases: (a) pretherapy or control period; 
(b) therapy period; (c) follow-up or posttherapy
period. Seventeen control cases
were tested twice, over a time interval 
close to the average length of therapy, as 
a way of controlling for time. Criterion 
measures were therapist judgment, an 
objective test, a projective test, and fol- 
low-up letters. The last two were found 
to have adequate intrascorer and inter- 
scorer reliability, the figures for which 
can be examined in the Appendix. Ther- 
apy was conducted in school by a single 
female therapist, the investigator, who 
did no testing. 


The average length of therapy was seen
to be 19 sessions, with 10 of the 17 cases
considered relatively successful by the
therapist. The therapist's judgment was
unrelated to length of therapy, method
of termination (by child or therapist),
or sex of the client. Nor were there any
sex differences in duration of therapy or
in termination method.

On the Rogers Test, we noted certain 
therapy improvements reliably greater 
than those of the pretherapy period and 
of the separate control group. While the 
process of change did not continue in 
the follow-up period, the gains of therapy 
were maintained. Test results were unrelated
to the sex of the child, the duration
of therapy, or to termination method, 
but they were related to the therapist's 
judgment in the case of the Rogers Test. 

The Sentence Completion Test showed
great stability of Themes and Attitudes,
regardless of experimental conditions.
Mean Adjustment Rating showed gains
during therapy, and these were reliably
greater than those of the pretherapy interval
and the separate control group.
The therapy gains were maintained
across the follow-up period, but no significant
improvements occurred once
therapy had stopped. Subgroup comparisons
were minor. Results on the Sentence
Completion Test were unrelated to the
therapist's judgment of outcome. Likewise,
they bore no relation to the duration
of therapy.

Follow-up letters showed that positive
(favorable) memories of the therapy experience
predominated, but that no reliable
trends existed for reports of present
life status. Moreover, the therapy
memories and reported status were unrelated.
On an inspection basis, no subgroup
differences appeared.


The three measures of therapy outcomes
were found to be uncorrelated.
This suggests that if each examines some
aspect of personal adjustment, it is likely
to be a different facet of the criterion
universe.

Our conclusions are as follows. 

1. Reliable test improvements occur
concomitantly with a series of therapy
sessions.

2. Time alone does not produce reli- 
able improvements on our tests. While 
individuals may show “‘spontaneous re- 
mission,” the group as a whole does not. 

3. Our results fail to support the view
that therapy starts a process of change
which continues after the interviews
cease. Therapy gains remain a year later,
but they do not grow.

4. Despite the emotional dependence
of children upon parents, therapy improvements
occur without parent counseling.

5. Projective test improvements ap- 
pear greater than those on an objective 
test. 

6. Since our projective test is not a 
validated instrument, it may be a poorer 
index than the objective test, which has 
known validity. 

7. Effective therapy can be done in a 
school setting, insofar as tests may meas- 
ure outcomes. 

8. Sex differences are notably absent. 

9. Within the limits of our procedures,
therapy gains are not proportionate to 
the number of sessions. 

10. No general claims for client-centered
therapy can be made from this
study since all children were seen by the
same therapist.

11. Owing to the absence of behavioral 
data, we do not know whether test im 
provements reflect actual changes in lite 
adjustments. 


APPENDIX

SUMMARY OF SCORING METHOD AND RELIABILITY
SENTENCE COMPLETION TEST

Theme Analysis

Each completion is assigned to one of 12 content
categories, chosen to cover the main areas of living
of concern to children: work, play, health, the future,
the self, and persons important in the child's life.
Category 7, Animals, is included because a random
sample of 50 records drawn from the research group's
schoolmates showed frequent mention of animals. The
list of themes follows.

 1. Self          7. Animals
 2. Siblings      8. School
 3. Mother        9. Recreation
 4. Father       10. Sickness-Death
 5. Family       11. Future
 6. Others       12. Miscellaneous

The principles governing categorization, definitions
of themes, and scoring samples are given in detail in
the original manual (7, pp. 120-188).

Attitude Analysis

Each completion is classified as Positive, Negative,
or Neutral, depending upon its affective tone. This is
independent of the theme category. That is, a completion
may, for example, be placed in a theme category which
ordinarily implies a positive attitude and yet be given
a Negative attitude score.

Adjustment Rating Analysis

This consists of making judgments on 10 characteristics,
each of which is evaluated on a seven-point scale. Each
scale represents a continuum from best to worst
adjustment, with higher ratings meaning worse adjustment.
The 10 qualities are chosen to include only those apt to
show change if therapy occurs, and all to tap some aspect
of personal adjustment. An eleventh index, Mean Adjustment
Rating, is calculated for each record by averaging ratings
on the 10. This is our index of general adjustment status.
The list of 10 qualities, together with abbreviated
definitions, is given below.

1. Anxiety: a feeling of painful uneasiness regarding
the future, characterized by mingled or alternating dread
and hope (30, p. 16).

2. Security: a condition of present safety, assurance,
comfort, and being loved.

3. Dependence: a condition of being unable to sustain
or help oneself, or to perform a given act, without the
consent or aid of some external person, group, or force.

4. Conflict: an unpleasant emotional state resulting
from opposed tendencies or desires, each of which is of
approximately equal strength.

5. Affectivity: the disposition to emotional experience,
the capacity to be moved to feelings by internal or
external events.

6. Flexibility: the capacity to alter one's attitudes
or behavior in adaptation to changing circumstances.

7. Spontaneity: the capacity to initiate behavior on
the basis of inner feeling, proneness, or temperament,
as contrasted with the need for constraint or immediate
external influence.

8. Self-regard: the general disposition to react with
approval or disapproval toward oneself.

9. Attitude Toward People: the general disposition to
react with approval or disapproval toward persons,
exclusive of family members.

10. Family Attitude: the general disposition to react
with approval or disapproval toward the family unit, as
well as its members.

In the original scoring manual (7, pp. 120-188), each
of the above qualities is defined in greater detail, both
in general and in terms of its extremes. A list of
Sentence Completion Test characteristics which may
contribute to the rating also accompanies each definition.
Likewise, each of the seven scale positions is defined.
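
The Mean Adjustment Rating described above is a simple average of the 10 scale ratings for one record. The following is a minimal sketch of that computation. It assumes the seven scale positions are coded 1 through 7 (the manual's actual coding is not reproduced here), and the function name and the example record are hypothetical, invented only for illustration.

```python
from statistics import mean

# The 10 rated characteristics, in the order listed in the appendix.
SCALES = [
    "Anxiety", "Security", "Dependence", "Conflict", "Affectivity",
    "Flexibility", "Spontaneity", "Self-regard",
    "Attitude Toward People", "Family Attitude",
]


def mean_adjustment_rating(ratings):
    """Average the 10 ratings for one record (higher = worse adjustment).

    Assumes each rating is coded 1..7; that numeric coding is an assumption.
    """
    missing = [scale for scale in SCALES if scale not in ratings]
    if missing:
        raise ValueError(f"missing scales: {missing}")
    for scale in SCALES:
        if not 1 <= ratings[scale] <= 7:
            raise ValueError(f"{scale} rating must lie between 1 and 7")
    return mean(ratings[scale] for scale in SCALES)


# Hypothetical record, invented for illustration only.
example_record = dict(zip(SCALES, [5, 4, 3, 5, 4, 2, 3, 6, 4, 4]))
print(mean_adjustment_rating(example_record))
```

Because every scale runs in the same direction, no item needs to be reversed before averaging.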
Reliability

All reliability determinations are based upon the same
10 records, randomly drawn from the control and
experimental cases, except where otherwise noted.

Split-Half Reliability

The test is divided into halves by matching stems for
presumed Attitude stimulus value. For each half of each
protocol, we total the number of completions falling into
each of our three Attitude categories. Chi square tests
the hypothesis that the halves are samples from a common
population. The 10 individual chi squares (one for each
of 10 records) are summed to give Σχ² = 15.92, p < .80,
thus affording no basis for rejecting the hypothesis.
Pooled χ² = 0.40, p < .90, and Interaction χ² = 15.52,
p < .70, so that the pooled figure is an adequate estimate
(27, pp. 188-192). The test halves are probably alike in
Attitude categories elicited.

The split-half analysis is not made for Themes, since
the test has no equivalent halves for them. It is also
irrelevant to the Adjustment Ratings, which do not depend
on the same test items in all records.
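
The split-half check above rests on a partition of chi square: the sum of the per-record values equals the pooled value plus an interaction (heterogeneity) component, which is how 15.92 = 0.40 + 15.52 arises. The sketch below illustrates that arithmetic in plain Python. The counts are hypothetical, the function names are invented for this illustration, and the report does not state the degrees of freedom; under the usual procedure they would be 20 for the sum (10 records, 2 each), 2 for the pooled table, and 18 for the interaction.

```python
def chi_square(table):
    """Pearson chi square for a table of counts (rows = halves, columns = categories)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # an unused category contributes nothing
                stat += (observed - expected) ** 2 / expected
    return stat


def heterogeneity(records):
    """Return (sum of individual chi squares, pooled chi square, interaction)."""
    total = sum(chi_square(table) for table in records)
    # Add the records cell by cell to form one pooled halves-by-categories table.
    pooled_table = [[sum(cells) for cells in zip(*rows)] for rows in zip(*records)]
    pooled = chi_square(pooled_table)
    return total, pooled, total - pooled


# Hypothetical counts: per record, [first half, second half], each giving
# the number of Positive, Negative, and Neutral completions.
example_records = [
    [[6, 2, 2], [2, 6, 2]],
    [[5, 3, 2], [4, 4, 2]],
    [[3, 5, 2], [5, 3, 2]],
]
total, pooled, interaction = heterogeneity(example_records)
print(round(total, 2), round(pooled, 2), round(interaction, 2))

# The reported figures satisfy the same identity.
print(round(15.92 - 0.40, 2))
```

A small interaction component, as reported, indicates that the records do not differ among themselves in how the halves compare, which is why the pooled figure can stand for the group.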
TABLE 9

INTERCORRELATIONS OF SENTENCES ADJUSTMENT RATING SCALES AT
FIRST TESTING OF 34 SUBJECTS*

Scales (Aspects of Adjustment), in column order: Anxiety,
Security, Dependence, Conflict, Affectivity, Flexibility,
Spontaneity, Self, People, Family

Security: .80
Dependence: .50 [remaining entries illegible]
Conflict: .49 [remaining entries illegible]
Affectivity: .44, .36, .27, .26
Flexibility: .15, .07, .01, .02, .02
Spontaneity: .32, .30, .09, .19 [remaining entries illegible]
Self: .68, .60, .40, .39, .29, .25, .30
People: .42, .29, .17, .31, .15, .45 [remaining entries illegible]
Family: .43 [remaining entries illegible]
Mean Rating†: .75, .69, .59, .48, -.09, .36, .62, .50, .43 [one entry illegible]

* For df = 32, the following r values are required at the
given p levels: (a) for p < .001, r = .54; (b) for p < .01,
r = .44; (c) for p < .02, r = .40; (d) for p < .05, r = .34.
† Corrected for contamination by McNemar's formula for r
between a subtest and a total test minus that subtest
(17, p. 139).

Intercorrelations of Rated Adjustment
Characteristics

Although the scoring manual (7, pp. 120-188) directs
separate consideration of each characteristic, it is
possible that some global quality is being rated under
different names. To test this, we intercorrelate ratings
on all the scales, by combining the 17 experimental and
17 control cases at first testing, to give N = 34. Table 9
gives the Pearson r's, and shows that 22 of the 45
interscale r's (excluding Mean Rating) are significant at
the .05 level or better (r = .34), so that we surely do
not have 10 discrete measures. Our aim was not to predict
changes on these individual scales, but to use them to
derive a general notion of adjustment status, that
embodied in the Mean Rating. Table 9 shows that 9 of the
10 scales correlate with Mean Rating at the .05 level or
better, indicating internal consistency. Thus, the general
adjustment index is based upon converging measures. While
we might have used a direct assessment of over-all
adjustment instead of Mean Rating, previous research with
a similar test showed a global rating to be less reliable
and valid than a score derived from specific items (31).

Rescoring Reliability

The writer rescored 10 records after six months.
Agreement is [illegible] for Themes, whereas chance
expectancy is 8%, since 12 categories are involved. For
Attitudes, the agreement is 93%, while chance expectancy
is 33%, since three categories are involved. Both of the
obtained intrajudge agreements are significant beyond the
.001 level.

For the adjustment rating analysis, the Pearson r between
the two sets of pooled ratings is .91, significant at the
[illegible] level. On the 10 individual scales, rho ranges
from .69 to .96, and all are significant at or beyond the
.05 level. Thus, all the scales can be reliably used by
the single scorer.
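
The agreement and chance figures quoted for rescoring can be illustrated with a short sketch. It assumes that chance expectancy means equal probability for each category, which reproduces the 8% given for 12 Theme categories and the 33% given for three Attitude categories. The function names and the two example scorings are hypothetical.

```python
def percent_agreement(first, second):
    """Percentage of items given the same category in both scorings."""
    if len(first) != len(second) or not first:
        raise ValueError("scorings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(first, second))
    return 100.0 * matches / len(first)


def chance_expectancy(n_categories):
    """Chance agreement in percent, assuming all categories are equally likely."""
    return 100.0 / n_categories


# Hypothetical Attitude scorings of the same completions, six months apart.
first_scoring = ["Positive", "Negative", "Neutral", "Positive", "Negative"]
second_scoring = ["Positive", "Negative", "Positive", "Positive", "Negative"]

print(percent_agreement(first_scoring, second_scoring))  # 80.0
print(round(chance_expectancy(12)))  # 8, the figure given for Themes
print(round(chance_expectancy(3)))   # 33, the figure given for Attitudes
```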


Interjudge Reliability 


To check the experimenter's work, two outside
judges evaluated the same 10 cases. Their training
for the task consisted of study of the scoring
manual, which includes a demonstration analysis 
of one case. Then each judge scored two sample 
cases as a training exercise. The sample analyses 
were carefully studied by the investigator who 
then gave detailed criticisms to each judge in- 
dividually. 

The work of three judges is compared. Judge
A, the writer, had worked intensively with disturbed
children, and had been trained in other
projective tests. Judge B, a graduate student, was 
mainly an adult therapist, but had used non- 
projective tests with children. Judge C, a clinical 
psychologist, had diagnostic and therapeutic ex- 
perience with children and adults, but projective 
test experience with adults only. Thus, all judges 
had clinical experience with children, but with 
methods other than sentence completions. 


Based on 10 records, interjudge agreement on Themes
ranges from 76% to 86%; agreement on Attitudes ranges
from [illegible] to [illegible]. These percentages differ
significantly from chance expectancy at the .001 level.
Pearson r's for the pooled Adjustment Ratings range from
.64 to .78, all significant at the .001 level. Agreement
on individual scales is tested by the W statistic, which
ranges from .53 to .89, significant at the .02 level or
better for 8 of the 10 scales, the 2 exceptions being
Dependence and Flexibility. For Mean Rating, our general
adjustment index, W = .82 (p < .01).

The disagreement on Dependence and Flexibility is general,
and is not a function of any particular pair of judges.
Rho coefficients show absence of correlation between any
pair of judges. Since Judge A, the investigator, could
reliably use these scales, lack of interjudge agreement
may reflect inadequacies in the scoring manual.


SUMMARY OF SCORING METHOD AND RELIABILITY
OF THE FOLLOW-UP LETTERS


Therapy Memories

Each letter is classified Positive, Negative, or
Neutral, depending upon its affective tone.

1. Positive: expressions of pleasant memories, fun,
benefit derived, or that the therapist is missed.

2. Negative: expressions of displeasure about the
sessions, statements denying their value, or hostile
demands for explanations of their purpose.

3. Neutral: mere recitals of recalled activities,
without statements implying approval or disapproval.

Reports of Present Status

Each letter is judged Positive, Negative, or Neutral
for reported life status, depending upon the degree of
well-being or personal satisfaction expressed.

1. Positive: statements that the child is doing well,
whether in terms of personal happiness or school progress;
favorable references to currently enjoyed activities.

2. Negative: accounts of failure experiences,
unhappiness, or generally poor progress.

3. Neutral: noncommittal remarks about current
activities, as well as complete omission of reference to
present life status.


Reliability

All reliability determinations are based upon the total
group of 15 cases followed up.

Rescoring Reliability

The writer analyzed each letter twice, with a six-month
interval between. Intrajudge agreement is [illegible] for
therapy memories and 93% for reports of present status,
while chance expectancy is 33%, since three categories are
involved. The obtained agreement differs significantly
from chance expectancy at the .001 level.

Interjudge Reliability

Three judges independently classified all letters,
working from the scoring manual (7, pp. 189-192). Judge A
is the writer. Judge B is a clinical psychologist with
four years of diagnostic and therapeutic experience with
children and adults. Judge C is a graduate student in the
humanities, without prior experience of this kind.
Agreement between pairs of judges ranges between 87% and
93% on therapy memories, and reported status agreements
are all 80%. All differ significantly from chance
expectancy at the .001 level. Judges B and C have
identical agreement percentages with the writer,
suggesting that the task of judging this material does
not require training beyond that given in the scoring
manual.
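
The claim that 80% agreement on 15 letters exceeds a 33% chance expectancy at the .001 level can be checked with a one-sided exact binomial test. This is only a sketch of one way to verify the figure: the report does not say which test was used, and reading 80% of 15 letters as 12 agreements is an assumption. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_upper_tail(n, k, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed reading of the data: 12 agreements out of 15 letters (80%),
# with chance agreement of 1/3 for three equally likely categories.
tail = binomial_upper_tail(15, 12, Fraction(1, 3))
print(tail)         # exact probability as a fraction
print(float(tail))  # the same value as a decimal
print(tail < Fraction(1, 1000))
```

Since the lowest of the reported interjudge agreements already falls below .001 under this test, the higher ones (87% and 93%) do so as well.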